Scalable and Robust Group Discovery on Large Transactional Data

Authors

  • Patrick Pakyan Choi
  • Andrew W. Moore
  • Jeremy Kubica
  • Andrew Moore

Abstract

The need for time-critical analysis and understanding of the underlying group structure in transactional data has been growing in domains such as law enforcement and customs. Kubica et al. (2003) proposed k-groups, an algorithm based on a probabilistic generative model for discovering underlying groups in data. Even though k-groups is reported to be significantly faster than its predecessor GDA (Kubica et al., 2002), it is too slow and memory-intensive for large data in practice. This paper presents XGDA, a framework for scalable and robust group discovery. An evaluation of the performance of XGDA and k-groups shows that XGDA can handle extremely large datasets in reasonable time and yields more robust solutions than k-groups.

Similar resources

The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data

Background and Objectives: In recent years, new technologies have led to the production of large amounts of data, and in the field of biology, microarray technology has also developed dramatically. Meanwhile, the Fisher test is used to compare the control group with two or more experimental groups and also to detect differentially expressed genes. In this study, the false discovery rate was investiga...

Fast Unsupervised Automobile Insurance Fraud Detection Based on Spectral Ranking of Anomalies

Collecting insurance fraud samples is costly and, if performed manually, very time-consuming. This suggests the use of unsupervised models. One accurate method in this regard is Spectral Ranking of Anomalies (SRA), which has been shown to work better than other methods specifically for auto insurance fraud detection. However, this approach is not scalable to large samples and is not appro...

The Effect of Berne's Transactional Analysis Training on Improving the Functioning of Parents of Male High School Students in Rafsanjan

Background and Objectives: A person's happiness depends to a large extent on his or her communication with others. The purpose of this study was to investigate the effectiveness of transactional analysis training on the functioning of parents of male high school students in Rafsanjan in 2009. Materials and Methods: This experimental study was performed on 40 parents of high ...

Intelligent scalable image watermarking robust against progressive DWT-based compression using genetic algorithms

Image watermarking refers to the process of embedding an authentication message, called a watermark, into the host image to uniquely identify its ownership. In this paper, a novel, intelligent, scalable, robust wavelet-based watermarking approach is proposed. The proposed approach employs a genetic algorithm to find nearly optimal positions at which to insert the watermark. The embedding positions coded as chr...

Comparing the Effectiveness of Reality Therapy and Transactional Analysis Approach on Stress Coping Strategies in Infertile Women

Introduction: Infertile women face many problems in their personal lives, and infertility is accompanied by many psychological injuries, including increased stress, which requires the use of psychological interventions. Therefore, the present research aimed to compare the effectiveness of reality therapy and the transactional analysis approach on stress coping strategies in infertile women. Method...

Publication date: 2005